13 research outputs found

    PoLyScriber: Integrated Training of Extractor and Lyrics Transcriber for Polyphonic Music

    Lyrics transcription of polyphonic music is challenging because the background music affects lyrics intelligibility. Typically, lyrics transcription is performed by a two-step pipeline, i.e. a singing vocal extraction frontend followed by a lyrics transcriber backend, where the frontend and backend are trained separately. Such a two-step pipeline suffers from both imperfect vocal extraction and a mismatch between frontend and backend. In this work, we propose a novel end-to-end integrated training framework, which we call PoLyScriber, to globally optimize the vocal extractor frontend and lyrics transcriber backend for lyrics transcription in polyphonic music. The experimental results show that our proposed integrated training model achieves substantial improvements over the existing approaches on publicly available test datasets. Comment: 13 pages

    Emotion Regulation Training Companion: A Design Science Research

    Emotion regulation ability plays a critical role in our physical and mental well-being. Strong emotion regulation capability is associated with academic success and life satisfaction. This capability also bolsters people’s psychological strength to bounce back from negative, challenging emotional experiences. Although interest in designing emotion regulation support technology has grown significantly in the past few years, much of the effort has gone into helping users change their physiological responses to emotional episodes. Few works have focused on designing a solution that equips users with the necessary “tools” to master their own emotions. Informed by theories in affective neuroscience, we propose to design a conversational agent that helps users regulate their emotions by identifying their true cause, labeling them with a wide range of emotion words, and gradually building up their emotional granularity. By acquiring more fine-grained emotion concepts, we expect users to have more clarity and flexibility in navigating their emotional experiences.

    Computational Music Systems for Emotional Health and Wellbeing: A Review

    Music is a powerful stimulus, and both active and receptive methods of engaging with music provide affordances for improving physical, mental and social health. The emergence of sophisticated computational methods also underscores the potential for novel music technologies to address a wider range of wellbeing outcomes. In this review, we focus on describing the current state of the literature on computational approaches to music generation for health and wellbeing and identifying possible future directions for research in this area.

    VR.net: A Real-world Dataset for Virtual Reality Motion Sickness Research

    Researchers have used machine learning approaches to identify motion sickness in VR experiences. These approaches demand an accurately labeled, real-world, and diverse dataset for high accuracy and generalizability. As a starting point to address this need, we introduce 'VR.net', a dataset offering approximately 12 hours of gameplay videos from ten real-world games in ten diverse genres. For each video frame, a rich set of motion sickness-related labels, such as camera/object movement, depth field, and motion flow, is accurately assigned. Building such a dataset is challenging since manual labeling would require an infeasible amount of time. Instead, we utilize a tool to automatically and precisely extract ground truth data from 3D engines' rendering pipelines without accessing VR games' source code. We illustrate the utility of VR.net through several applications, such as risk factor detection and sickness level prediction. We are continuously expanding VR.net and envision its next version offering 10x more data than the current form. We believe that the scale, accuracy, and diversity of VR.net can offer unparalleled opportunities for VR motion sickness research and beyond.

    COMPREHENSIVE EVALUATION OF SINGING QUALITY

    Ph.D. DOCTOR OF PHILOSOPHY (NGS

    Environmental noise assessment and its effect on human health in an urban area (IPA, open access, distributed under Creative Commons Attribution License 2.0)

    Traffic noise is a major environmental source of pollution worldwide, in both developed and developing nations. The present study focuses on traffic noise assessment and its negative health effects on roadside residents. Five different locations were selected along a national highway in Burdwan, with daytime Leq levels ranging from 60 to 89.5 dBA. Evaluation of various noise descriptors such as L10, L50, L90, Leq, LNP and TNI showed that residents of the study area experienced effects ranging from slight discomfort to noise annoyance. Health effects among 52 people from 10 families who had resided in the study area for a long time were assessed through a questionnaire-based survey. Responses were collected for analysis, and the outcome revealed that 53%, 36% and 40% of people suffered from headache, anxiety and high blood pressure, whereas 36%, 15%, 67% and 61% of people suffered from hearing disability, cardiovascular diseases, irritability and insomnia, respectively. A chi-square test was conducted among the different physiological and psychological effects, and it was found that noise has a significant (α = 0.05) effect on hearing loss, sleep disturbances, abnormal heartbeat and speech communication problems.

    Automatic Leaderboard: Evaluation of Singing Quality Without a Standard Reference

    10.1109/TASLP.2019.2947737; IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING; 28; 13-2

    Using music technology to motivate foreign language learning

    Music is a fun and engaging form of entertainment and is often used by teachers to help students learn languages. In this paper, we describe how recent advances in music technology can be used to develop language learning applications that might help children, young adults, and adult learners grow their vocabularies, improve their pronunciation, and increase their cultural appreciation. We describe two apps that are under development: a karaoke app and a personalized radio app. Our goal is to provide teachers and students with new tools that are engaging, promote joyful learning, and improve foreign language learning and mother tongue retention.